Dataset statistics
| | Dataset A | Dataset B |
|---|---|---|
| Number of variables | 12 | 12 |
| Number of observations | 446 | 446 |
| Missing cells | 427 | 412 |
| Missing cells (%) | 8.0% | 7.7% |
| Duplicate rows | 0 | 0 |
| Duplicate rows (%) | 0.0% | 0.0% |
| Total size in memory | 45.3 KiB | 45.3 KiB |
| Average record size in memory | 104.0 B | 104.0 B |
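The percentage and per-record figures in the table above can be re-derived from the raw counts. The sketch below assumes that "Missing cells (%)" is the missing-cell count divided by variables × observations, and that "Average record size in memory" is the total size divided by the number of observations. The helper names are illustrative and are not part of ydata-profiling.

```python
# Re-derive the percentage and per-record figures of the
# "Dataset statistics" table from the raw counts reported there.
# Assumption: percentages are taken over all cells (variables x observations).

N_VARIABLES = 12
N_OBSERVATIONS = 446
TOTAL_SIZE_KIB = 45.3  # reported as identical for both datasets


def missing_cells_pct(missing_cells: int) -> float:
    """Share of missing cells among all cells, in percent."""
    total_cells = N_VARIABLES * N_OBSERVATIONS
    return 100.0 * missing_cells / total_cells


def average_record_size_bytes() -> float:
    """Total in-memory size divided by the number of rows, in bytes."""
    return TOTAL_SIZE_KIB * 1024 / N_OBSERVATIONS


pct_a = missing_cells_pct(427)
pct_b = missing_cells_pct(412)
record_size = average_record_size_bytes()

print(f"Dataset A: {pct_a:.1f}% missing cells")
print(f"Dataset B: {pct_b:.1f}% missing cells")
print(f"Average record size: {record_size:.1f} B")
```

Under these assumptions the results round to the reported 8.0%, 7.7% and 104.0 B.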
Variable types
| | Dataset A | Dataset B |
|---|---|---|
| Numeric | 5 | 5 |
| Categorical | 4 | 4 |
| Text | 3 | 3 |
Alerts
| Dataset A | Dataset B | Alert type |
|---|---|---|
| Age has 95 (21.3%) missing values | Age has 70 (15.7%) missing values | Missing |
| Cabin has 331 (74.2%) missing values | Cabin has 342 (76.7%) missing values | Missing |
| PassengerId has unique values | PassengerId has unique values | Unique |
| Name has unique values | Name has unique values | Unique |
| SibSp has 305 (68.4%) zeros | SibSp has 317 (71.1%) zeros | Zeros |
| Parch has 341 (76.5%) zeros | Parch has 347 (77.8%) zeros | Zeros |
| Fare has 9 (2.0%) zeros | Fare has 8 (1.8%) zeros | Zeros |
Reproduction
| | Dataset A | Dataset B |
|---|---|---|
| Analysis started | 2024-05-07 15:12:04.585484 | 2024-05-07 15:12:08.542699 |
| Analysis finished | 2024-05-07 15:12:08.540292 | 2024-05-07 15:12:12.511061 |
| Duration | 3.95 seconds | 3.97 seconds |
| Software version | ydata-profiling v0.0.dev0 | ydata-profiling v0.0.dev0 |
| Download configuration | config.json | config.json |
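The "Duration" row is consistent with the difference between the "Analysis finished" and "Analysis started" timestamps. A small sketch using only the Python standard library (the function name is illustrative):

```python
from datetime import datetime

# Timestamp layout used in the "Reproduction" table.
FMT = "%Y-%m-%d %H:%M:%S.%f"


def duration_seconds(started: str, finished: str) -> float:
    """Elapsed time between two report timestamps, in seconds."""
    delta = datetime.strptime(finished, FMT) - datetime.strptime(started, FMT)
    return delta.total_seconds()


duration_a = duration_seconds(
    "2024-05-07 15:12:04.585484", "2024-05-07 15:12:08.540292"
)
duration_b = duration_seconds(
    "2024-05-07 15:12:08.542699", "2024-05-07 15:12:12.511061"
)

print(f"Dataset A: {duration_a:.2f} seconds")
print(f"Dataset B: {duration_b:.2f} seconds")
```

The timestamps also show that the profile of Dataset B started about two milliseconds after the profile of Dataset A finished.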
PassengerId
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 439.287 | 452.46188 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 3 | 2 |
| Maximum | 885 | 891 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 3 | 2 |
| 5-th percentile | 44.5 | 58.25 |
| Q1 | 216.25 | 222.25 |
| median | 440 | 448 |
| Q3 | 655.5 | 682.5 |
| 95-th percentile | 844.5 | 843.75 |
| Maximum | 885 | 891 |
| Range | 882 | 889 |
| Interquartile range (IQR) | 439.25 | 460.25 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 256.18304 | 260.65243 |
| Coefficient of variation (CV) | 0.58317921 | 0.57607601 |
| Kurtosis | -1.1607818 | -1.2761312 |
| Mean | 439.287 | 452.46188 |
| Median Absolute Deviation (MAD) | 218 | 230 |
| Skewness | 0.028394703 | -0.02010083 |
| Sum | 195922 | 201798 |
| Variance | 65629.751 | 67939.692 |
| Monotonicity | Not monotonic | Not monotonic |
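Several rows in the PassengerId tables are functions of other rows: mean = sum / count, CV = standard deviation / mean, variance = standard deviation squared, range = maximum − minimum, and IQR = Q3 − Q1. The sketch below re-derives them from the reported inputs. The dictionary layout and function name are illustrative only, and small differences in the last digits come from the rounding of the reported values.

```python
# Re-derive dependent PassengerId statistics from the reported inputs.
# All input numbers are copied from the tables above.

stats = {
    "Dataset A": {"n": 446, "sum": 195922, "std": 256.18304,
                  "q1": 216.25, "q3": 655.5, "min": 3, "max": 885},
    "Dataset B": {"n": 446, "sum": 201798, "std": 260.65243,
                  "q1": 222.25, "q3": 682.5, "min": 2, "max": 891},
}


def derive(s: dict) -> dict:
    """Compute the statistics that follow from the other reported rows."""
    mean = s["sum"] / s["n"]
    return {
        "mean": mean,
        "cv": s["std"] / mean,          # coefficient of variation
        "variance": s["std"] ** 2,
        "range": s["max"] - s["min"],
        "iqr": s["q3"] - s["q1"],       # interquartile range
    }


derived = {name: derive(s) for name, s in stats.items()}

for name, d in derived.items():
    print(name, {key: round(value, 5) for key, value in d.items()})
```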
| Value | Count | Frequency (%) |
| 167 | 1 | 0.2% |
| 125 | 1 | 0.2% |
| 132 | 1 | 0.2% |
| 556 | 1 | 0.2% |
| 117 | 1 | 0.2% |
| 605 | 1 | 0.2% |
| 719 | 1 | 0.2% |
| 74 | 1 | 0.2% |
| 286 | 1 | 0.2% |
| 52 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
| Value | Count | Frequency (%) |
| 823 | 1 | 0.2% |
| 337 | 1 | 0.2% |
| 884 | 1 | 0.2% |
| 677 | 1 | 0.2% |
| 702 | 1 | 0.2% |
| 492 | 1 | 0.2% |
| 123 | 1 | 0.2% |
| 635 | 1 | 0.2% |
| 891 | 1 | 0.2% |
| 405 | 1 | 0.2% |
| Other values (436) | 436 | 97.8% |
| Value | Count | Frequency (%) |
| 3 | 1 | |
| 4 | 1 | |
| 7 | 1 | |
| 9 | 1 | |
| 11 | 1 | |
| 12 | 1 | |
| 13 | 1 | |
| 15 | 1 | |
| 18 | 1 | |
| 20 | 1 |
| Value | Count | Frequency (%) |
| 2 | 1 | |
| 3 | 1 | |
| 4 | 1 | |
| 8 | 1 | |
| 13 | 1 | |
| 14 | 1 | |
| 22 | 1 | |
| 23 | 1 | |
| 27 | 1 | |
| 29 | 1 |
Survived
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 2 | 2 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 0 |
| 2nd row | 0 | 0 |
| 3rd row | 0 | 1 |
| 4th row | 1 | 0 |
| 5th row | 0 | 0 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 264 | 59.2% |
| 1 | 182 | 40.8% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 278 | 62.3% |
| 1 | 168 | 37.7% |
[Plots: Length; Common Values (Dataset A, Dataset B)]
| Value | Count | Frequency (%) |
| 0 | 264 | |
| 1 | 182 |
| Value | Count | Frequency (%) |
| 0 | 278 | |
| 1 | 168 |
Most occurring characters
| Value | Count | Frequency (%) |
| 0 | 264 | |
| 1 | 182 |
| Value | Count | Frequency (%) |
| 0 | 278 | |
| 1 | 168 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 0 | 264 | |
| 1 | 182 |
| Value | Count | Frequency (%) |
| 0 | 278 | |
| 1 | 168 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 0 | 264 | |
| 1 | 182 |
| Value | Count | Frequency (%) |
| 0 | 278 | |
| 1 | 168 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 0 | 264 | |
| 1 | 182 |
| Value | Count | Frequency (%) |
| 0 | 278 | |
| 1 | 168 |
Pclass
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 446 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 1 | 1 |
| 2nd row | 3 | 2 |
| 3rd row | 3 | 3 |
| 4th row | 2 | 3 |
| 5th row | 3 | 3 |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 244 | 54.7% |
| 1 | 121 | 27.1% |
| 2 | 81 | 18.2% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 250 | 56.1% |
| 1 | 112 | 25.1% |
| 2 | 84 | 18.8% |
[Plots: Length; Common Values (Dataset A, Dataset B)]
| Value | Count | Frequency (%) |
| 3 | 244 | |
| 1 | 121 | |
| 2 | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 3 | 250 | |
| 1 | 112 | |
| 2 | 84 | 18.8% |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 244 | |
| 1 | 121 | |
| 2 | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 3 | 250 | |
| 1 | 112 | |
| 2 | 84 | 18.8% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 244 | |
| 1 | 121 | |
| 2 | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 3 | 250 | |
| 1 | 112 | |
| 2 | 84 | 18.8% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 244 | |
| 1 | 121 | |
| 2 | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 3 | 250 | |
| 1 | 112 | |
| 2 | 84 | 18.8% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 446 |
| Value | Count | Frequency (%) |
| (unknown) | 446 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 244 | |
| 1 | 121 | |
| 2 | 81 | 18.2% |
| Value | Count | Frequency (%) |
| 3 | 250 | |
| 1 | 112 | |
| 2 | 84 | 18.8% |
Name
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 446 | 446 |
| Distinct (%) | 100.0% | 100.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 82 | 61 |
| Median length | 48 | 49 |
| Mean length | 27.320628 | 27.026906 |
| Min length | 15 | 12 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 12185 | 12054 |
| Distinct characters | 59 | 59 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 446 | 446 |
| Unique (%) | 100.0% | 100.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | Chibnall, Mrs. (Edith Martha Bowerman) | Reuchlin, Jonkheer. John George |
| 2nd row | Jussila, Miss. Mari Aina | Cunningham, Mr. Alfred Fleming |
| 3rd row | Soholt, Mr. Peter Andreas Lauritz Andersen | Chip, Mr. Chang |
| 4th row | Hart, Miss. Eva Miriam | Coelho, Mr. Domingos Fernandeo |
| 5th row | Hagland, Mr. Ingvald Olai Olsen | Sirota, Mr. Maurice |
| Value | Count | Frequency (%) |
| mr | 256 | 14.0% |
| miss | 92 | 5.0% |
| mrs | 66 | 3.6% |
| william | 27 | 1.5% |
| john | 22 | 1.2% |
| master | 20 | 1.1% |
| george | 14 | 0.8% |
| james | 14 | 0.8% |
| henry | 14 | 0.8% |
| thomas | 13 | 0.7% |
| Other values (926) | 1291 |
| Value | Count | Frequency (%) |
| mr | 267 | 14.7% |
| miss | 92 | 5.1% |
| mrs | 58 | 3.2% |
| william | 31 | 1.7% |
| john | 20 | 1.1% |
| henry | 20 | 1.1% |
| master | 19 | 1.0% |
| james | 13 | 0.7% |
| edward | 12 | 0.7% |
| thomas | 11 | 0.6% |
| Other values (912) | 1273 |
Most occurring characters
| Value | Count | Frequency (%) |
| (space) | 1383 | 11.4% |
| r | 986 | 8.1% |
| a | 847 | 7.0% |
| e | 833 | 6.8% |
| i | 695 | 5.7% |
| n | 660 | 5.4% |
| s | 659 | 5.4% |
| M | 572 | 4.7% |
| l | 522 | 4.3% |
| o | 503 | 4.1% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 992 | 8.2% |
| e | 883 | 7.3% |
| a | 812 | 6.7% |
| i | 683 | 5.7% |
| n | 649 | 5.4% |
| s | 624 | 5.2% |
| M | 557 | 4.6% |
| l | 541 | 4.5% |
| o | 486 | 4.0% |
| Other values (49) | 4456 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 12185 |
| Value | Count | Frequency (%) |
| (unknown) | 12054 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1383 | 11.4% |
| r | 986 | 8.1% |
| a | 847 | 7.0% |
| e | 833 | 6.8% |
| i | 695 | 5.7% |
| n | 660 | 5.4% |
| s | 659 | 5.4% |
| M | 572 | 4.7% |
| l | 522 | 4.3% |
| o | 503 | 4.1% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 992 | 8.2% |
| e | 883 | 7.3% |
| a | 812 | 6.7% |
| i | 683 | 5.7% |
| n | 649 | 5.4% |
| s | 624 | 5.2% |
| M | 557 | 4.6% |
| l | 541 | 4.5% |
| o | 486 | 4.0% |
| Other values (49) | 4456 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 12185 |
| Value | Count | Frequency (%) |
| (unknown) | 12054 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1383 | 11.4% |
| r | 986 | 8.1% |
| a | 847 | 7.0% |
| e | 833 | 6.8% |
| i | 695 | 5.7% |
| n | 660 | 5.4% |
| s | 659 | 5.4% |
| M | 572 | 4.7% |
| l | 522 | 4.3% |
| o | 503 | 4.1% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 992 | 8.2% |
| e | 883 | 7.3% |
| a | 812 | 6.7% |
| i | 683 | 5.7% |
| n | 649 | 5.4% |
| s | 624 | 5.2% |
| M | 557 | 4.6% |
| l | 541 | 4.5% |
| o | 486 | 4.0% |
| Other values (49) | 4456 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 12185 |
| Value | Count | Frequency (%) |
| (unknown) | 12054 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| (space) | 1383 | 11.4% |
| r | 986 | 8.1% |
| a | 847 | 7.0% |
| e | 833 | 6.8% |
| i | 695 | 5.7% |
| n | 660 | 5.4% |
| s | 659 | 5.4% |
| M | 572 | 4.7% |
| l | 522 | 4.3% |
| o | 503 | 4.1% |
| Other values (49) | 4525 |
| Value | Count | Frequency (%) |
| (space) | 1371 | 11.4% |
| r | 992 | 8.2% |
| e | 883 | 7.3% |
| a | 812 | 6.7% |
| i | 683 | 5.7% |
| n | 649 | 5.4% |
| s | 624 | 5.2% |
| M | 557 | 4.6% |
| l | 541 | 4.5% |
| o | 486 | 4.0% |
| Other values (49) | 4456 |
Sex
Categorical
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 2 | 2 |
| Distinct (%) | 0.4% | 0.4% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 6 | 6 |
| Median length | 4 | 4 |
| Mean length | 4.7219731 | 4.6860987 |
| Min length | 4 | 4 |
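The length figures for Sex follow directly from the value counts reported under Common Values (285 male and 161 female in Dataset A, 293 male and 153 female in Dataset B). The sketch below expands the counts into string lengths and recomputes the length statistics and the total character count. The function name is illustrative, not a ydata-profiling API.

```python
from statistics import mean, median

# Value counts copied from the "Common Values" tables for Sex.
counts = {
    "Dataset A": {"male": 285, "female": 161},
    "Dataset B": {"male": 293, "female": 153},
}


def length_stats(value_counts: dict) -> dict:
    """Length statistics of a categorical column, given its value counts."""
    # One length entry per row of the column.
    lengths = [
        len(value)
        for value, count in value_counts.items()
        for _ in range(count)
    ]
    return {
        "total_characters": sum(lengths),
        "mean_length": mean(lengths),
        "median_length": median(lengths),
        "min_length": min(lengths),
        "max_length": max(lengths),
    }


lengths = {name: length_stats(vc) for name, vc in counts.items()}

for name, result in lengths.items():
    print(name, result)
```

Because more than half of the rows are "male" in both datasets, the median length is 4 even though the mean lies between 4 and 6.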
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 2106 | 2090 |
| Distinct characters | 5 | 5 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | female | male |
| 2nd row | female | male |
| 3rd row | male | male |
| 4th row | female | male |
| 5th row | male | male |
Common Values
Dataset A
| Value | Count | Frequency (%) |
|---|---|---|
| male | 285 | 63.9% |
| female | 161 | 36.1% |
Dataset B
| Value | Count | Frequency (%) |
|---|---|---|
| male | 293 | 65.7% |
| female | 153 | 34.3% |
[Plots: Length; Common Values (Dataset A, Dataset B)]
| Value | Count | Frequency (%) |
| male | 285 | |
| female | 161 |
| Value | Count | Frequency (%) |
| male | 293 | |
| female | 153 |
Most occurring characters
| Value | Count | Frequency (%) |
| e | 607 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 161 | 7.6% |
| Value | Count | Frequency (%) |
| e | 599 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 153 | 7.3% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 2106 |
| Value | Count | Frequency (%) |
| (unknown) | 2090 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| e | 607 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 161 | 7.6% |
| Value | Count | Frequency (%) |
| e | 599 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 153 | 7.3% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 2106 |
| Value | Count | Frequency (%) |
| (unknown) | 2090 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| e | 607 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 161 | 7.6% |
| Value | Count | Frequency (%) |
| e | 599 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 153 | 7.3% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 2106 |
| Value | Count | Frequency (%) |
| (unknown) | 2090 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| e | 607 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 161 | 7.6% |
| Value | Count | Frequency (%) |
| e | 599 | |
| m | 446 | |
| a | 446 | |
| l | 446 | |
| f | 153 | 7.3% |
Age
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 73 | 73 |
| Distinct (%) | 20.8% | 19.4% |
| Missing | 95 | 70 |
| Missing (%) | 21.3% | 15.7% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 29.534444 | 29.184628 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.92 |
| Maximum | 71 | 80 |
| Zeros | 0 | 0 |
| Zeros (%) | 0.0% | 0.0% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0.42 | 0.92 |
| 5-th percentile | 4 | 5 |
| Q1 | 21 | 20 |
| median | 28 | 28 |
| Q3 | 38 | 36 |
| 95-th percentile | 55.5 | 54 |
| Maximum | 71 | 80 |
| Range | 70.58 | 79.08 |
| Interquartile range (IQR) | 17 | 16 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 14.38638 | 14.008752 |
| Coefficient of variation (CV) | 0.48710515 | 0.48000446 |
| Kurtosis | 0.12985517 | 0.51525078 |
| Mean | 29.534444 | 29.184628 |
| Median Absolute Deviation (MAD) | 8 | 8 |
| Skewness | 0.34331607 | 0.46383981 |
| Sum | 10366.59 | 10973.42 |
| Variance | 206.96793 | 196.24512 |
| Monotonicity | Not monotonic | Not monotonic |
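For Age, the reported figures are consistent with two different denominators: "Missing (%)" is taken over all 446 rows, while "Distinct (%)" and "Mean" are taken over the non-missing rows only. The sketch below checks this reading against the reported numbers. It is an inference from the tables rather than a statement about ydata-profiling internals, and the names are illustrative.

```python
# Check which denominators reproduce the reported Age figures.
N_ROWS = 446

age = {
    "Dataset A": {"distinct": 73, "missing": 95, "sum": 10366.59},
    "Dataset B": {"distinct": 73, "missing": 70, "sum": 10973.42},
}


def summarize(s: dict) -> dict:
    """Percentages and mean under the assumed denominators."""
    n_present = N_ROWS - s["missing"]  # rows with a non-missing Age
    return {
        "missing_pct": 100.0 * s["missing"] / N_ROWS,
        "distinct_pct": 100.0 * s["distinct"] / n_present,
        "mean": s["sum"] / n_present,
    }


age_summary = {name: summarize(s) for name, s in age.items()}

for name, result in age_summary.items():
    print(
        f"{name}: missing {result['missing_pct']:.1f}%, "
        f"distinct {result['distinct_pct']:.1f}%, "
        f"mean {result['mean']:.6f}"
    )
```

The same reading matches the Cabin overview, where 94 distinct values over 446 − 331 = 115 non-missing rows gives the reported 81.7%.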
| Value | Count | Frequency (%) |
| 28 | 16 | 3.6% |
| 30 | 15 | 3.4% |
| 24 | 14 | 3.1% |
| 19 | 13 | 2.9% |
| 21 | 13 | 2.9% |
| 33 | 12 | 2.7% |
| 22 | 12 | 2.7% |
| 29 | 11 | 2.5% |
| 36 | 11 | 2.5% |
| 18 | 11 | 2.5% |
| Other values (63) | 223 | 50.0% |
| (Missing) | 95 | 21.3% |
| Value | Count | Frequency (%) |
| 29 | 18 | 4.0% |
| 22 | 17 | 3.8% |
| 24 | 17 | 3.8% |
| 32 | 14 | 3.1% |
| 18 | 14 | 3.1% |
| 19 | 14 | 3.1% |
| 21 | 13 | 2.9% |
| 28 | 12 | 2.7% |
| 36 | 12 | 2.7% |
| 34 | 11 | 2.5% |
| Other values (63) | 234 | 52.5% |
| (Missing) | 70 | 15.7% |
| Value | Count | Frequency (%) |
| 0.42 | 1 | 0.2% |
| 0.67 | 1 | 0.2% |
| 0.75 | 2 | 0.4% |
| 1 | 3 | |
| 2 | 4 | |
| 3 | 5 | |
| 4 | 5 | |
| 5 | 3 | |
| 6 | 1 | 0.2% |
| 7 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0.92 | 1 | 0.2% |
| 1 | 3 | 0.7% |
| 2 | 8 | |
| 3 | 3 | 0.7% |
| 4 | 3 | 0.7% |
| 5 | 3 | 0.7% |
| 6 | 2 | 0.4% |
| 7 | 3 | 0.7% |
| 8 | 2 | 0.4% |
| 9 | 4 |
SibSp
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 7 |
| Distinct (%) | 1.6% | 1.6% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.48206278 | 0.48654709 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 8 | 8 |
| Zeros | 305 | 317 |
| Zeros (%) | 68.4% | 71.1% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 1 | 1 |
| 95-th percentile | 2 | 2 |
| Maximum | 8 | 8 |
| Range | 8 | 8 |
| Interquartile range (IQR) | 1 | 1 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.98455089 | 1.0927949 |
| Coefficient of variation (CV) | 2.0423707 | 2.2460208 |
| Kurtosis | 18.567066 | 17.783857 |
| Mean | 0.48206278 | 0.48654709 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 3.64063 | 3.749945 |
| Sum | 215 | 217 |
| Variance | 0.96934045 | 1.1942006 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 305 | |
| 1 | 108 | 24.2% |
| 2 | 15 | 3.4% |
| 4 | 9 | 2.0% |
| 3 | 5 | 1.1% |
| 8 | 2 | 0.4% |
| 5 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 317 | |
| 1 | 96 | 21.5% |
| 2 | 11 | 2.5% |
| 4 | 10 | 2.2% |
| 3 | 5 | 1.1% |
| 5 | 4 | 0.9% |
| 8 | 3 | 0.7% |
| Value | Count | Frequency (%) |
| 0 | 305 | |
| 1 | 108 | 24.2% |
| 2 | 15 | 3.4% |
| 3 | 5 | 1.1% |
| 4 | 9 | 2.0% |
| 5 | 2 | 0.4% |
| 8 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 317 | |
| 1 | 96 | 21.5% |
| 2 | 11 | 2.5% |
| 3 | 5 | 1.1% |
| 4 | 10 | 2.2% |
| 5 | 4 | 0.9% |
| 8 | 3 | 0.7% |
Parch
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 7 | 6 |
| Distinct (%) | 1.6% | 1.3% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 0.37443946 | 0.35426009 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 6 | 5 |
| Zeros | 341 | 347 |
| Zeros (%) | 76.5% | 77.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 0 | 0 |
| Q1 | 0 | 0 |
| median | 0 | 0 |
| Q3 | 0 | 0 |
| 95-th percentile | 2 | 2 |
| Maximum | 6 | 5 |
| Range | 6 | 5 |
| Interquartile range (IQR) | 0 | 0 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 0.79105762 | 0.77027196 |
| Coefficient of variation (CV) | 2.1126449 | 2.174312 |
| Kurtosis | 10.061917 | 8.7473145 |
| Mean | 0.37443946 | 0.35426009 |
| Median Absolute Deviation (MAD) | 0 | 0 |
| Skewness | 2.7304394 | 2.674933 |
| Sum | 167 | 158 |
| Variance | 0.62577216 | 0.59331889 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 0 | 341 | |
| 1 | 57 | 12.8% |
| 2 | 41 | 9.2% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 347 | |
| 1 | 53 | 11.9% |
| 2 | 40 | 9.0% |
| 4 | 3 | 0.7% |
| 5 | 2 | 0.4% |
| 3 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 341 | |
| 1 | 57 | 12.8% |
| 2 | 41 | 9.2% |
| 3 | 3 | 0.7% |
| 4 | 2 | 0.4% |
| 5 | 1 | 0.2% |
| 6 | 1 | 0.2% |
| Value | Count | Frequency (%) |
| 0 | 347 | |
| 1 | 53 | 11.9% |
| 2 | 40 | 9.0% |
| 3 | 1 | 0.2% |
| 4 | 3 | 0.7% |
| 5 | 2 | 0.4% |
Ticket
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 380 | 384 |
| Distinct (%) | 85.2% | 86.1% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 18 | 18 |
| Median length | 17 | 17 |
| Mean length | 6.7466368 | 6.8049327 |
| Min length | 3 | 3 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 3009 | 3035 |
| Distinct characters | 31 | 32 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 326 | 339 |
| Unique (%) | 73.1% | 76.0% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | 113505 | 19972 |
| 2nd row | 4137 | 239853 |
| 3rd row | 348124 | 1601 |
| 4th row | F.C.C. 13529 | SOTON/O.Q. 3101307 |
| 5th row | 65303 | 392092 |
| Value | Count | Frequency (%) |
| pc | 31 | 5.5% |
| c.a | 11 | 2.0% |
| 2 | 8 | 1.4% |
| ston/o | 8 | 1.4% |
| a/5 | 6 | 1.1% |
| sc/paris | 6 | 1.1% |
| ca | 6 | 1.1% |
| 347082 | 4 | 0.7% |
| 2666 | 4 | 0.7% |
| s.o.c | 4 | 0.7% |
| Other values (399) | 476 |
| Value | Count | Frequency (%) |
| pc | 31 | 5.4% |
| c.a | 13 | 2.3% |
| ca | 8 | 1.4% |
| ston/o | 8 | 1.4% |
| 2 | 8 | 1.4% |
| a/5 | 8 | 1.4% |
| 1601 | 5 | 0.9% |
| soton/o.q | 5 | 0.9% |
| w./c | 4 | 0.7% |
| sc/paris | 4 | 0.7% |
| Other values (403) | 475 |
Most occurring characters
| Value | Count | Frequency (%) |
| 3 | 382 | |
| 1 | 339 | |
| 2 | 283 | |
| 4 | 255 | |
| 7 | 243 | |
| 6 | 214 | 7.1% |
| 0 | 197 | 6.5% |
| 5 | 189 | 6.3% |
| 9 | 172 | 5.7% |
| 8 | 133 | 4.4% |
| Other values (21) | 602 |
| Value | Count | Frequency (%) |
| 3 | 372 | |
| 1 | 360 | |
| 2 | 297 | |
| 7 | 239 | 7.9% |
| 4 | 226 | 7.4% |
| 6 | 207 | 6.8% |
| 0 | 204 | 6.7% |
| 5 | 184 | 6.1% |
| 9 | 181 | 6.0% |
| 8 | 149 | 4.9% |
| Other values (22) | 616 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 3009 |
| Value | Count | Frequency (%) |
| (unknown) | 3035 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| 3 | 382 | |
| 1 | 339 | |
| 2 | 283 | |
| 4 | 255 | |
| 7 | 243 | |
| 6 | 214 | 7.1% |
| 0 | 197 | 6.5% |
| 5 | 189 | 6.3% |
| 9 | 172 | 5.7% |
| 8 | 133 | 4.4% |
| Other values (21) | 602 |
| Value | Count | Frequency (%) |
| 3 | 372 | |
| 1 | 360 | |
| 2 | 297 | |
| 7 | 239 | 7.9% |
| 4 | 226 | 7.4% |
| 6 | 207 | 6.8% |
| 0 | 204 | 6.7% |
| 5 | 184 | 6.1% |
| 9 | 181 | 6.0% |
| 8 | 149 | 4.9% |
| Other values (22) | 616 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 3009 |
| Value | Count | Frequency (%) |
| (unknown) | 3035 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| 3 | 382 | |
| 1 | 339 | |
| 2 | 283 | |
| 4 | 255 | |
| 7 | 243 | |
| 6 | 214 | 7.1% |
| 0 | 197 | 6.5% |
| 5 | 189 | 6.3% |
| 9 | 172 | 5.7% |
| 8 | 133 | 4.4% |
| Other values (21) | 602 |
| Value | Count | Frequency (%) |
| 3 | 372 | |
| 1 | 360 | |
| 2 | 297 | |
| 7 | 239 | 7.9% |
| 4 | 226 | 7.4% |
| 6 | 207 | 6.8% |
| 0 | 204 | 6.7% |
| 5 | 184 | 6.1% |
| 9 | 181 | 6.0% |
| 8 | 149 | 4.9% |
| Other values (22) | 616 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 3009 |
| Value | Count | Frequency (%) |
| (unknown) | 3035 |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| 3 | 382 | |
| 1 | 339 | |
| 2 | 283 | |
| 4 | 255 | |
| 7 | 243 | |
| 6 | 214 | 7.1% |
| 0 | 197 | 6.5% |
| 5 | 189 | 6.3% |
| 9 | 172 | 5.7% |
| 8 | 133 | 4.4% |
| Other values (21) | 602 |
| Value | Count | Frequency (%) |
| 3 | 372 | |
| 1 | 360 | |
| 2 | 297 | |
| 7 | 239 | 7.9% |
| 4 | 226 | 7.4% |
| 6 | 207 | 6.8% |
| 0 | 204 | 6.7% |
| 5 | 184 | 6.1% |
| 9 | 181 | 6.0% |
| 8 | 149 | 4.9% |
| Other values (22) | 616 |
Fare
Real number (ℝ)
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 187 | 183 |
| Distinct (%) | 41.9% | 41.0% |
| Missing | 0 | 0 |
| Missing (%) | 0.0% | 0.0% |
| Infinite | 0 | 0 |
| Infinite (%) | 0.0% | 0.0% |
| Mean | 31.707006 | 31.233865 |
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| Maximum | 262.375 | 512.3292 |
| Zeros | 9 | 8 |
| Zeros (%) | 2.0% | 1.8% |
| Negative | 0 | 0 |
| Negative (%) | 0.0% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Quantile statistics
| | Dataset A | Dataset B |
|---|---|---|
| Minimum | 0 | 0 |
| 5-th percentile | 7.225 | 7.05105 |
| Q1 | 7.8958 | 7.925 |
| median | 14.75 | 13 |
| Q3 | 34.28645 | 32.875 |
| 95-th percentile | 105.05 | 90.8094 |
| Maximum | 262.375 | 512.3292 |
| Range | 262.375 | 512.3292 |
| Interquartile range (IQR) | 26.39065 | 24.95 |
Descriptive statistics
| | Dataset A | Dataset B |
|---|---|---|
| Standard deviation | 41.724491 | 45.850089 |
| Coefficient of variation (CV) | 1.3159392 | 1.4679608 |
| Kurtosis | 11.00823 | 33.155174 |
| Mean | 31.707006 | 31.233865 |
| Median Absolute Deviation (MAD) | 7.2417 | 5.775 |
| Skewness | 3.0279947 | 4.6348444 |
| Sum | 14141.325 | 13930.304 |
| Variance | 1740.9332 | 2102.2307 |
| Monotonicity | Not monotonic | Not monotonic |
| Value | Count | Frequency (%) |
| 7.8958 | 22 | 4.9% |
| 13 | 22 | 4.9% |
| 7.75 | 19 | 4.3% |
| 8.05 | 18 | 4.0% |
| 26 | 11 | 2.5% |
| 7.225 | 9 | 2.0% |
| 0 | 9 | 2.0% |
| 7.925 | 9 | 2.0% |
| 8.6625 | 8 | 1.8% |
| 7.775 | 8 | 1.8% |
| Other values (177) | 311 |
| Value | Count | Frequency (%) |
| 8.05 | 26 | 5.8% |
| 13 | 21 | 4.7% |
| 7.75 | 18 | 4.0% |
| 7.8958 | 17 | 3.8% |
| 26 | 12 | 2.7% |
| 10.5 | 11 | 2.5% |
| 7.925 | 11 | 2.5% |
| 7.775 | 9 | 2.0% |
| 0 | 8 | 1.8% |
| 8.6625 | 8 | 1.8% |
| Other values (173) | 305 |
| Value | Count | Frequency (%) |
| 0 | 9 | |
| 5 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 1 | 0.2% |
| 6.8583 | 1 | 0.2% |
| 6.95 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
| 7.05 | 2 | 0.4% |
| Value | Count | Frequency (%) |
| 0 | 8 | |
| 5 | 1 | 0.2% |
| 6.2375 | 1 | 0.2% |
| 6.4375 | 1 | 0.2% |
| 6.45 | 1 | 0.2% |
| 6.4958 | 1 | 0.2% |
| 6.75 | 2 | 0.4% |
| 6.95 | 1 | 0.2% |
| 6.975 | 1 | 0.2% |
| 7.0458 | 1 | 0.2% |
Cabin
Text
| | Dataset A | Dataset B |
|---|---|---|
| Distinct | 94 | 81 |
| Distinct (%) | 81.7% | 77.9% |
| Missing | 331 | 342 |
| Missing (%) | 74.2% | 76.7% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| | Dataset A | Dataset B |
|---|---|---|
| Max length | 15 | 15 |
| Median length | 3 | 3 |
| Mean length | 3.4956522 | 3.6538462 |
| Min length | 1 | 1 |
Characters and Unicode
| | Dataset A | Dataset B |
|---|---|---|
| Total characters | 402 | 380 |
| Distinct characters | 18 | 19 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| | Dataset A | Dataset B |
|---|---|---|
| Unique | 74 | 61 |
| Unique (%) | 64.3% | 58.7% |
Sample
| | Dataset A | Dataset B |
|---|---|---|
| 1st row | E33 | C106 |
| 2nd row | F G73 | C22 C26 |
| 3rd row | C22 C26 | B35 |
| 4th row | B102 | F33 |
| 5th row | E101 | C2 |
| Value | Count | Frequency (%) |
| g6 | 3 | 2.3% |
| e8 | 2 | 1.5% |
| d36 | 2 | 1.5% |
| b98 | 2 | 1.5% |
| b96 | 2 | 1.5% |
| c126 | 2 | 1.5% |
| d35 | 2 | 1.5% |
| e67 | 2 | 1.5% |
| c78 | 2 | 1.5% |
| b60 | 2 | 1.5% |
| Other values (95) | 110 |
| Value | Count | Frequency (%) |
| g6 | 3 | 2.4% |
| b96 | 3 | 2.4% |
| b98 | 3 | 2.4% |
| f33 | 3 | 2.4% |
| d33 | 2 | 1.6% |
| b22 | 2 | 1.6% |
| c26 | 2 | 1.6% |
| c22 | 2 | 1.6% |
| g73 | 2 | 1.6% |
| f | 2 | 1.6% |
| Other values (84) | 100 |
Most occurring characters
| Value | Count | Frequency (%) |
| B | 41 | 10.2% |
| 6 | 38 | 9.5% |
| 3 | 34 | 8.5% |
| 1 | 33 | 8.2% |
| C | 33 | 8.2% |
| 2 | 33 | 8.2% |
| 5 | 24 | 6.0% |
| 7 | 19 | 4.7% |
| E | 19 | 4.7% |
| 4 | 19 | 4.7% |
| Other values (8) | 109 |
| Value | Count | Frequency (%) |
| B | 44 | |
| 2 | 40 | |
| 3 | 31 | 8.2% |
| 1 | 31 | 8.2% |
| C | 30 | 7.9% |
| 5 | 24 | 6.3% |
| 8 | 22 | 5.8% |
| 6 | 22 | 5.8% |
| (space) | 20 | 5.3% |
| D | 20 | 5.3% |
| Other values (9) | 96 |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 402 |
| Value | Count | Frequency (%) |
| (unknown) | 380 |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| B | 41 | 10.2% |
| 6 | 38 | 9.5% |
| 3 | 34 | 8.5% |
| 1 | 33 | 8.2% |
| C | 33 | 8.2% |
| 2 | 33 | 8.2% |
| 5 | 24 | 6.0% |
| 7 | 19 | 4.7% |
| E | 19 | 4.7% |
| 4 | 19 | 4.7% |
| Other values (8) | 109 |
| Value | Count | Frequency (%) |
| B | 44 | |
| 2 | 40 | |
| 3 | 31 | 8.2% |
| 1 | 31 | 8.2% |
| C | 30 | 7.9% |
| 5 | 24 | 6.3% |
| 8 | 22 | 5.8% |
| 6 | 22 | 5.8% |
| (space) | 20 | 5.3% |
| D | 20 | 5.3% |
| Other values (9) | 96 |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 402 |
| Value | Count | Frequency (%) |
| (unknown) | 380 |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| B | 41 | 10.2% |
| 6 | 38 | 9.5% |
| 3 | 34 | 8.5% |
| 1 | 33 | 8.2% |
| C | 33 | 8.2% |
| 2 | 33 | 8.2% |
| 5 | 24 | 6.0% |
| 7 | 19 | 4.7% |
| E | 19 | 4.7% |
| 4 | 19 | 4.7% |
| Other values (8) | 109 |
| Value | Count | Frequency (%) |
| B | 44 | |
| 2 | 40 | |
| 3 | 31 | 8.2% |
| 1 | 31 | 8.2% |
| C | 30 | 7.9% |
| 5 | 24 | 6.3% |
| 8 | 22 | 5.8% |
| 6 | 22 | 5.8% |
| (space) | 20 | 5.3% |
| D | 20 | 5.3% |
| Other values (9) | 96 |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 402 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 380 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| B | 41 | 10.2% |
| 6 | 38 | 9.5% |
| 3 | 34 | 8.5% |
| 1 | 33 | 8.2% |
| C | 33 | 8.2% |
| 2 | 33 | 8.2% |
| 5 | 24 | 6.0% |
| 7 | 19 | 4.7% |
| E | 19 | 4.7% |
| 4 | 19 | 4.7% |
| Other values (8) | 109 | 27.1% |
| Value | Count | Frequency (%) |
| B | 44 | 11.6% |
| 2 | 40 | 10.5% |
| 3 | 31 | 8.2% |
| 1 | 31 | 8.2% |
| C | 30 | 7.9% |
| 5 | 24 | 6.3% |
| 8 | 22 | 5.8% |
| 6 | 22 | 5.8% |
| | 20 | 5.3% |
| D | 20 | 5.3% |
| Other values (9) | 96 | 25.3% |
Embarked
Categorical
| Dataset A | Dataset B | |
|---|---|---|
| Distinct | 3 | 3 |
| Distinct (%) | 0.7% | 0.7% |
| Missing | 1 | 0 |
| Missing (%) | 0.2% | 0.0% |
| Memory size | 7.0 KiB | 7.0 KiB |
Length
| Dataset A | Dataset B | |
|---|---|---|
| Max length | 1 | 1 |
| Median length | 1 | 1 |
| Mean length | 1 | 1 |
| Min length | 1 | 1 |
Characters and Unicode
| Dataset A | Dataset B | |
|---|---|---|
| Total characters | 445 | 446 |
| Distinct characters | 3 | 3 |
| Distinct categories | 1 | 1 |
| Distinct scripts | 1 | 1 |
| Distinct blocks | 1 | 1 |
Unique
| Dataset A | Dataset B | |
|---|---|---|
| Unique | 0 | 0 |
| Unique (%) | 0.0% | 0.0% |
Sample
| Dataset A | Dataset B | |
|---|---|---|
| 1st row | S | S |
| 2nd row | S | S |
| 3rd row | S | S |
| 4th row | S | S |
| 5th row | S | S |
Common Values
| Value | Count | Frequency (%) |
| S | 309 | 69.3% |
| C | 92 | 20.6% |
| Q | 44 | 9.9% |
| (Missing) | 1 | 0.2% |
| Value | Count | Frequency (%) |
| S | 334 | 74.9% |
| C | 76 | 17.0% |
| Q | 36 | 8.1% |
[Figures: Length; Common Values (Plot); one panel each for Dataset A and Dataset B]
| Value | Count | Frequency (%) |
| s | 309 | 69.4% |
| c | 92 | 20.7% |
| q | 44 | 9.9% |
| Value | Count | Frequency (%) |
| s | 334 | 74.9% |
| c | 76 | 17.0% |
| q | 36 | 8.1% |
Most occurring characters
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 92 | 20.7% |
| Q | 44 | 9.9% |
| Value | Count | Frequency (%) |
| S | 334 | 74.9% |
| C | 76 | 17.0% |
| Q | 36 | 8.1% |
Most occurring categories
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
Most frequent character per category
(unknown)
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 92 | 20.7% |
| Q | 44 | 9.9% |
| Value | Count | Frequency (%) |
| S | 334 | 74.9% |
| C | 76 | 17.0% |
| Q | 36 | 8.1% |
Most occurring scripts
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
Most frequent character per script
(unknown)
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 92 | 20.7% |
| Q | 44 | 9.9% |
| Value | Count | Frequency (%) |
| S | 334 | 74.9% |
| C | 76 | 17.0% |
| Q | 36 | 8.1% |
Most occurring blocks
| Value | Count | Frequency (%) |
| (unknown) | 445 | 100.0% |
| Value | Count | Frequency (%) |
| (unknown) | 446 | 100.0% |
Most frequent character per block
(unknown)
| Value | Count | Frequency (%) |
| S | 309 | 69.4% |
| C | 92 | 20.7% |
| Q | 44 | 9.9% |
| Value | Count | Frequency (%) |
| S | 334 | 74.9% |
| C | 76 | 17.0% |
| Q | 36 | 8.1% |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 166 | 167 | 1 | 1 | Chibnall, Mrs. (Edith Martha Bowerman) | female | NaN | 0 | 1 | 113505 | 55.0000 | E33 | S |
| 402 | 403 | 0 | 3 | Jussila, Miss. Mari Aina | female | 21.0 | 1 | 0 | 4137 | 9.8250 | NaN | S |
| 715 | 716 | 0 | 3 | Soholt, Mr. Peter Andreas Lauritz Andersen | male | 19.0 | 0 | 0 | 348124 | 7.6500 | F G73 | S |
| 535 | 536 | 1 | 2 | Hart, Miss. Eva Miriam | female | 7.0 | 0 | 2 | F.C.C. 13529 | 26.2500 | NaN | S |
| 451 | 452 | 0 | 3 | Hagland, Mr. Ingvald Olai Olsen | male | NaN | 1 | 0 | 65303 | 19.9667 | NaN | S |
| 778 | 779 | 0 | 3 | Kilgannon, Mr. Thomas J | male | NaN | 0 | 0 | 36865 | 7.7375 | NaN | Q |
| 91 | 92 | 0 | 3 | Andreasson, Mr. Paul Edvin | male | 20.0 | 0 | 0 | 347466 | 7.8542 | NaN | S |
| 498 | 499 | 0 | 1 | Allison, Mrs. Hudson J C (Bessie Waldo Daniels) | female | 25.0 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 508 | 509 | 0 | 3 | Olsen, Mr. Henry Margido | male | 28.0 | 0 | 0 | C 4001 | 22.5250 | NaN | S |
| 815 | 816 | 0 | 1 | Fry, Mr. Richard | male | NaN | 0 | 0 | 112058 | 0.0000 | B102 | S |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 822 | 823 | 0 | 1 | Reuchlin, Jonkheer. John George | male | 38.0 | 0 | 0 | 19972 | 0.0000 | NaN | S |
| 413 | 414 | 0 | 2 | Cunningham, Mr. Alfred Fleming | male | NaN | 0 | 0 | 239853 | 0.0000 | NaN | S |
| 838 | 839 | 1 | 3 | Chip, Mr. Chang | male | 32.0 | 0 | 0 | 1601 | 56.4958 | NaN | S |
| 131 | 132 | 0 | 3 | Coelho, Mr. Domingos Fernandeo | male | 20.0 | 0 | 0 | SOTON/O.Q. 3101307 | 7.0500 | NaN | S |
| 837 | 838 | 0 | 3 | Sirota, Mr. Maurice | male | NaN | 0 | 0 | 392092 | 8.0500 | NaN | S |
| 298 | 299 | 1 | 1 | Saalfeld, Mr. Adolphe | male | NaN | 0 | 0 | 19988 | 30.5000 | C106 | S |
| 297 | 298 | 0 | 1 | Allison, Miss. Helen Loraine | female | 2.0 | 1 | 2 | 113781 | 151.5500 | C22 C26 | S |
| 369 | 370 | 1 | 1 | Aubart, Mme. Leontine Pauline | female | 24.0 | 0 | 0 | PC 17477 | 69.3000 | B35 | C |
| 172 | 173 | 1 | 3 | Johnson, Miss. Eleanor Ileen | female | 1.0 | 1 | 1 | 347742 | 11.1333 | NaN | S |
| 39 | 40 | 1 | 3 | Nicola-Yarred, Miss. Jamila | female | 14.0 | 1 | 0 | 2651 | 11.2417 | NaN | C |
Dataset A
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 314 | 315 | 0 | 2 | Hart, Mr. Benjamin | male | 43.0 | 1 | 1 | F.C.C. 13529 | 26.2500 | NaN | S |
| 283 | 284 | 1 | 3 | Dorking, Mr. Edward Arthur | male | 19.0 | 0 | 0 | A/5. 10482 | 8.0500 | NaN | S |
| 492 | 493 | 0 | 1 | Molson, Mr. Harry Markland | male | 55.0 | 0 | 0 | 113787 | 30.5000 | C30 | S |
| 86 | 87 | 0 | 3 | Ford, Mr. William Neal | male | 16.0 | 1 | 3 | W./C. 6608 | 34.3750 | NaN | S |
| 709 | 710 | 1 | 3 | Moubarek, Master. Halim Gonios ("William George") | male | NaN | 1 | 1 | 2661 | 15.2458 | NaN | C |
| 381 | 382 | 1 | 3 | Nakid, Miss. Maria ("Mary") | female | 1.0 | 0 | 2 | 2653 | 15.7417 | NaN | C |
| 840 | 841 | 0 | 3 | Alhomaki, Mr. Ilmari Rudolf | male | 20.0 | 0 | 0 | SOTON/O2 3101287 | 7.9250 | NaN | S |
| 186 | 187 | 1 | 3 | O'Brien, Mrs. Thomas (Johanna "Hannah" Godfrey) | female | NaN | 1 | 0 | 370365 | 15.5000 | NaN | Q |
| 412 | 413 | 1 | 1 | Minahan, Miss. Daisy E | female | 33.0 | 1 | 0 | 19928 | 90.0000 | C78 | Q |
| 280 | 281 | 0 | 3 | Duane, Mr. Frank | male | 65.0 | 0 | 0 | 336439 | 7.7500 | NaN | Q |
Dataset B
| | PassengerId | Survived | Pclass | Name | Sex | Age | SibSp | Parch | Ticket | Fare | Cabin | Embarked |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 677 | 678 | 1 | 3 | Turja, Miss. Anna Sofia | female | 18.0 | 0 | 0 | 4138 | 9.8417 | NaN | S |
| 210 | 211 | 0 | 3 | Ali, Mr. Ahmed | male | 24.0 | 0 | 0 | SOTON/O.Q. 3101311 | 7.0500 | NaN | S |
| 196 | 197 | 0 | 3 | Mernagh, Mr. Robert | male | NaN | 0 | 0 | 368703 | 7.7500 | NaN | Q |
| 360 | 361 | 0 | 3 | Skoog, Mr. Wilhelm | male | 40.0 | 1 | 4 | 347088 | 27.9000 | NaN | S |
| 617 | 618 | 0 | 3 | Lobb, Mrs. William Arthur (Cordelia K Stanlick) | female | 26.0 | 1 | 0 | A/5. 3336 | 16.1000 | NaN | S |
| 872 | 873 | 0 | 1 | Carlsson, Mr. Frans Olof | male | 33.0 | 0 | 0 | 695 | 5.0000 | B51 B53 B55 | S |
| 117 | 118 | 0 | 2 | Turpin, Mr. William John Robert | male | 29.0 | 1 | 0 | 11668 | 21.0000 | NaN | S |
| 840 | 841 | 0 | 3 | Alhomaki, Mr. Ilmari Rudolf | male | 20.0 | 0 | 0 | SOTON/O2 3101287 | 7.9250 | NaN | S |
| 777 | 778 | 1 | 3 | Emanuel, Miss. Virginia Ethel | female | 5.0 | 0 | 0 | 364516 | 12.4750 | NaN | S |
| 165 | 166 | 1 | 3 | Goldsmith, Master. Frank John William "Frankie" | male | 9.0 | 0 | 2 | 363291 | 20.5250 | NaN | S |
Dataset A
Dataset does not contain duplicate rows.
Dataset B
Dataset does not contain duplicate rows.